A new method for finding generalized frequent itemsets in generalized association rule mining
Authors
Abstract
Generalized association rule mining extends traditional association rule mining to discover more informative rules, given a taxonomy. In this paper, we describe a formal framework for the problem of mining generalized association rules. In the framework, the subset-superset and parent-child relationships among generalized itemsets are introduced to present two different views of generalized itemsets: the lattice of generalized itemsets and the taxonomies of k-generalized itemsets, respectively. We present an optimization technique that reduces running time by applying two constraints, one for each view of generalized itemsets. For the mining process, we propose a new set enumeration algorithm, named SET, that uses these constraints to speed up the mining of all generalized frequent itemsets. Experiments on synthetic data show that SET outperforms the current most efficient algorithm, Prutax, by an order of magnitude or more.
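To make the problem setting concrete, the following is a minimal brute-force sketch of counting generalized frequent itemsets over a taxonomy. It is not the SET algorithm and not Prutax; it only illustrates what a generalized itemset and its support are. The child-to-parent dictionary, the function names, the toy data, and the absolute support threshold are all assumptions made for illustration. Itemsets that contain both an item and one of its ancestors are skipped, since their support equals that of the itemset without the ancestor.

```python
from itertools import combinations


def ancestors(item, parent):
    """Return every ancestor of `item`; `parent` maps child -> parent (assumed layout)."""
    result = set()
    while item in parent:
        item = parent[item]
        result.add(item)
    return result


def extend(transaction, parent):
    """Add all taxonomy ancestors of each item to the transaction."""
    out = set(transaction)
    for item in transaction:
        out |= ancestors(item, parent)
    return out


def generalized_frequent_itemsets(transactions, parent, min_support, max_size=2):
    """Naive enumeration: count every candidate up to `max_size` items."""
    extended = [extend(t, parent) for t in transactions]
    items = sorted(set().union(*extended))
    frequent = {}
    for k in range(1, max_size + 1):
        for candidate in combinations(items, k):
            # Skip itemsets that mix an item with one of its own ancestors.
            if any(a in ancestors(b, parent) for a in candidate for b in candidate):
                continue
            count = sum(1 for t in extended if set(candidate) <= t)
            if count >= min_support:
                frequent[candidate] = count
    return frequent


# Hypothetical toy taxonomy and transactions.
parent = {
    "jacket": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
}
transactions = [{"jacket", "shirt"}, {"ski pants"}, {"jacket"}, {"shirt"}]
result = generalized_frequent_itemsets(transactions, parent, min_support=2)
print(result)
```

With these toy inputs, "outerwear" is frequent even though "ski pants" alone is not, which is the kind of rule material that only appears once the taxonomy is taken into account. A real miner would prune the candidate space instead of enumerating it exhaustively.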
Similar resources
Fast Algorithms for Mining Generalized Frequent Patterns of Generalized Association Rules
Mining generalized frequent patterns of generalized association rules is an important process in knowledge discovery systems. In this paper, we propose a new approach for efficiently mining all frequent patterns using a novel set enumeration algorithm with two types of constraints on two generalized itemset relationships, called subset-superset and ancestor-descendant constraints. We also show a...
Fast Algorithm for Mining Generalized Association Rules
In this paper, we present a new algorithm for mining generalized association rules. The algorithm scans the database only once and uses Tidsets to compute the support of generalized itemsets faster. A tree structure called GIT-tree, an extension of IT-tree, is developed to store the database for mining frequent itemsets from a hierarchical database. Our algorithm is often faster than MM...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses. It alleviates the concerns of individuals and organizations about the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns, and so data confidentiality is preserved ag...
Mining Generalized Closed Frequent Itemsets of Generalized Association Rules
In the area of knowledge discovery in databases, generalized association rule mining extends traditional association rule mining by taking a database together with a taxonomy over the items in the database. More initiative and informative knowledge can be discovered. In this work, we propose a novel approach of generalized closed itemsets. A smaller set of generalized closed itemsets can be...
A New Algorithm for Faster Mining of Generalized Association Rules
Generalized association rules are a very important extension of boolean association rules, but with current approaches mining generalized rules is computationally very expensive. This becomes especially annoying when rule generation is part of an interactive KDD process. In this paper we discuss strengths and weaknesses of known approaches to generate frequent itemsets. Ba...